Improved HMM Separation for Distant-Talking Speech Recognition

Authors

Tetsuya Takiguchi

Masafumi Nishimura

Abstract

In distant-talking speech recognition, recognition accuracy is seriously degraded by reverberation and environmental noise. A robust speech recognition technique for such environments, HMM separation and composition, was described in [1]. HMM separation estimates the model parameters of the acoustic transfer function using adaptation data uttered from an unknown position in a noisy and reverberant environment, and HMM composition then builds an HMM of noisy and reverberant speech using the acoustic transfer function estimated by HMM separation. Previously, HMM separation was applied to an acoustic transfer function modeled by a single Gaussian distribution. However, the improvement was smaller than expected for impulse responses with long reverberation. This is because the impulse response of the room reverberation is longer than the spectral analysis window, which increases the variance of the acoustic transfer function in each frame. In this paper, HMM separation is extended to estimate an acoustic transfer function modeled by Gaussian mixture components in order to compensate for this greater variability, and the re-estimation formulae are derived. In addition, this paper introduces a technique that adapts the noise weight for each mel-spaced frequency in order to improve the performance of HMM separation in the linear-spectral domain, where the subtraction operation sometimes produces a negative mean. The extended HMM separation is evaluated on distant-talking speech recognition tasks, and the experimental results demonstrate the effectiveness of the proposed method.

Key words: distant-talking speech recognition, HMM separation, reverberation, noise
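The negative-mean problem mentioned in the abstract can be illustrated with a small sketch. It assumes the linear-spectral relation commonly used in HMM composition, O = S * H + alpha * N per frequency bin, and works on point-estimate means only; it is not the paper's re-estimation formulae. The function names and the per-bin `noise_weights` argument are hypothetical, chosen for illustration.

```python
# Illustrative sketch only: assumes O = S * H + alpha * N in each
# linear-spectral bin. Names and values are hypothetical, not from the paper.

def compose_mean(speech_mean, transfer_mean, noise_mean, noise_weights):
    """Compose the mean of noisy, reverberant speech for each bin."""
    return [
        s * h + a * n
        for s, h, n, a in zip(speech_mean, transfer_mean, noise_mean, noise_weights)
    ]


def separate_mean(observed_mean, speech_mean, noise_mean, noise_weights):
    """Invert the composition for the transfer-function mean in each bin."""
    return [
        (o - a * n) / s
        for o, s, n, a in zip(observed_mean, speech_mean, noise_mean, noise_weights)
    ]


speech = [4.0, 2.0]        # clean-speech means (assumed values)
true_transfer = [0.5, 0.25]
true_noise = [1.0, 1.0]
observed = compose_mean(speech, true_transfer, true_noise, [1.0, 1.0])

# If the noise model overestimates the second bin, plain subtraction
# yields a negative transfer-function mean there.
noise_model = [1.0, 2.0]
unweighted = separate_mean(observed, speech, noise_model, [1.0, 1.0])

# Scaling the noise in that bin by a weight below 1 keeps the estimate positive.
weighted = separate_mean(observed, speech, noise_model, [1.0, 0.5])

print(unweighted)  # [0.5, -0.25]
print(weighted)    # [0.5, 0.25]
```

In this toy case the weight for the second bin is set by hand; the paper adapts the weight for each mel-spaced frequency, which this sketch does not attempt to reproduce.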


Similar resources

HMM-separation-based speech recognition for a distant moving speaker

This paper presents a hands-free speech recognition method based on HMM composition and separation for speech contaminated not only by additive noise but also by an acoustic transfer function. The method realizes an improved user interface such that a user is not encumbered by microphone equipment in noisy and reverberant environments. The use of HMM composition has already been proposed for co...


Speech recognition in a reverberant environment using matched filter array (MFA) processing and linguistic-tree maximum likelihood linear regression (LT-MLLR) adaptation

Performance of automatic speech recognition systems trained on close-talking data suffers when used in a distant-talking environment due to the mismatch in training and testing conditions. Microphone array sound capture can reduce some mismatch by removing ambient noise and reverberation but offers insufficient improvement in performance. However, using array signal capture in conjunction with Hidd...


Speech recognition for a distant moving speaker based on HMM composition and separation

This paper describes a hands-free speech recognition method based on HMM composition and separation for speech contaminated not only by additive noise but also by an acoustic transfer function. The method realizes an improved user interface such that a user is not encumbered by microphone equipment in noisy and reverberant environments. In this approach, an attempt is made to model acoustic...


An evaluation of adaptive beamformer based on average speech spectrum for noisy speech recognition

Distant-talking speech recognition in noisy environments is indispensable for self-moving robots or tele-conference systems. However, background noise and room reverberations seriously degrade the sound-capture quality in real acoustic environments. A microphone array is an ideal candidate as an effective method for capturing distant-talking speech. AMNOR (Adaptive Microphone-array for NOise Re...


Matching the Acoustic Model to Front-End Signal Processing for ASR in Noisy and Reverberant Environments

Distant-talking automatic speech recognition (ASR) represents an extremely challenging task. The major reason is that unwanted additive interference and reverberation are picked up by the microphones besides the desired signal. A hands-free human-machine interface should therefore comprise a powerful acoustic preprocessing unit in line with a robust ASR back-end. However, since perfect speech e...



Journal:

IEICE Transactions

Volume 87-D

Publication date: 2004
